-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Add cANI estimation to branchwater #188
Conversation
@ctb @bluegenes
@bluegenes, Please let me know if you find any issues in the ANI code, so I can help fix them if needed. |
As long as it doesn't degrade performance, I think adding it as a default column is good! |
pairwise command before adding ANI
pairwise command after adding ANI
|
@bluegenes Am I correctly calculating ANI from containment here? https://github.com/sourmash-bio/sourmash_plugin_branchwater/pull/188/files#diff-d62689bb23b648c8b52cd5f21c1bf1cf3974ef38477c5ca006c9b1741d380aa8R74-R107 |
is that really a 2x hit in time!? |
This is an excellent opportunity for a test: calculate containment with regular sourmash code, then calculate it with your code, then compare the numbers... |
Yes :/
I can load the comparisons CSV output and use the containment values (max, min) to calculate avg ani with sourmash then validate with the new rust ANI column. Is that a valid test? |
Pretty much anything can be a valid test, but it sounds more roundabout than (a) running a comparison with sourmash, (b) saving the result, (c) writing a test that runs a comparison with branchwater and checks that you get approximately the same value? Hardcoding the known-good answer into a test is fine. |
hi @mr-eyes -- I'm moving ANI calculation into rust in sourmash-bio/sourmash#2943. Would you like to submit a PR with the equations file into that PR? I have some modifications I'll make, but I can do it over there. If not, lmk and I'll just add the code directly cc @ctb |
I don't mind copying or modifying the code. Although I would love to contribute to the amazing PR, I am not currently able to commit. Please go for it. |
you just run 😁 |
Hahaha, ok as long as my committed code will not be judged |
@bluegenes @ctb |
Closing in favor of #236 |
Adds ANI columns to pairwise and multisearch output, building off of @mr-eyes's ANI translations (#188) which got streamlined + added into sourmash core in sourmash-bio/sourmash#2943 Split from #234 to make things more concise/simpler. ## benchmark summary ## 12k ICTV viral genomes, scaled=200 79.8m comparisons | version | experiment | time | | -------- | -------- | -------- | | PR | no ANI | 21s | | PR | with ANI | 20s | | v0.9.0 | no ANI | 21s | ## 12k ICTV viral genomes, scaled=10 79.8m comparisons | version | experiment | time | | -------- | -------- | -------- | | PR | no ANI | 14m 0s | | PR | with ANI | 14m 6s | | main branch (~v0.9.0) | no ANI | 14m 47s |
resolves #164